Graph Clustering: Complexity, Sequential and
Abstract
In this thesis we study graph clustering on two well known families of graphs that arise in many applications, namely bipartite and chordal graphs. We study two specific types of clustering problems. On the one hand we seek a partition of the vertex set of a graph into a bounded number of sets so that a prespecified function determined by the diameters of the induced subgraphs is minimized. On the other hand we seek one subset of the vertex set that has a fixed number of vertices and induces the maximum possible number of edges.
The first problem type constitutes the main theme of this thesis. We define a class of functions on partitions of vertices, which we call monotone diameter functions, that will enable us to group many problems into one general problem. Examples of these problems are the following: partition the vertices of a graph into a limited number of subgraphs of bounded diameter, and partition the vertices of a graph into a limited number of subgraphs so that the sum of the diameters of the subgraphs is bounded.
We prove that the problem of partitioning the vertices of a graph into subgraphs of bounded diameter is NP-complete on bipartite and chordal graphs. In contrast with these negative results, we give linear time sequential algorithms for the same problem on bipartite permutation and interval graphs, the first being a subset of bipartite graphs and the second being a subset of chordal graphs. We also present efficient parallel algorithms for the general problem on biconvex and interval graphs. During the course of studying this problem on biconvex graphs, we prove that biconvex graphs have a certain ordering of the vertices, which we call a biconvex straight ordering, that has promising algorithmic potential. We present an efficient parallel algorithm for constructing a biconvex straight ordering for any biconvex graph.
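The two example problems named above can be made concrete with a small sketch. It is an illustration only, not the thesis's algorithm: it evaluates a given partition rather than searching for a good one, and the function names (`induced_diameter`, `max_diameter`, `sum_of_diameters`) and the adjacency-dictionary representation are assumptions introduced here. The formal definition of a monotone diameter function is not reproduced in this abstract, so the sketch covers only the maximum-diameter and sum-of-diameters objectives.

```python
from collections import deque


def induced_diameter(adj, cluster):
    """Diameter of the subgraph induced by `cluster`.

    `adj` maps each vertex to an iterable of its neighbours (an assumed
    representation). Returns infinity if the induced subgraph is disconnected.
    """
    cluster = set(cluster)
    worst = 0
    for source in cluster:
        # Breadth-first search restricted to vertices inside the cluster.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in cluster and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(cluster):
            return float("inf")  # some cluster vertex is unreachable
        worst = max(worst, max(dist.values()))
    return worst


def max_diameter(adj, partition):
    """Largest induced diameter over all parts (bounded-diameter objective)."""
    return max(induced_diameter(adj, part) for part in partition)


def sum_of_diameters(adj, partition):
    """Sum of induced diameters over all parts (sum-of-diameters objective)."""
    return sum(induced_diameter(adj, part) for part in partition)


# A path on five vertices; a path is both bipartite and chordal.
path = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
partition = [{"a", "b", "c"}, {"d", "e"}]

print(max_diameter(path, partition))      # 2
print(sum_of_diameters(path, partition))  # 3
```

Each part costs one breadth-first search per vertex, so this check is only meant for small examples; it says nothing about how the linear time or parallel algorithms of the thesis work.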
We also study the problem of finding a cluster of bounded size with the maximum possible number of edges, and present an efficient parallel algorithm for solving this problem on a class of unit interval graphs.
Acknowledgements
To God I owe everything. To everyone that helped make the completion of this thesis possible I offer my thanks. My supervisor, Lorna Stewart, provided me with expert guidance, many hours of discussion, and financial support to complete the thesis and to attend conferences. I am grateful for her time, effort, patience, and support. I am thankful to the other members of my committee: Stephen Hedetniemi, Jim Hoover, Joseph Culberson, and Andy Liu. Special thanks to Stephen Hedetniemi for suggesting many thoughtful improvements, to Jim Hoover for the time he spent with me discussing some of the parallel algorithms in this thesis, to Joseph Culberson for carefully reading the thesis, and to Andy Liu for pointing out a shorter proof for one of the theorems. To my parents, I am indebted with gratitude for their infinite support throughout all stages of the preparation of this thesis, and throughout all stages of my life. I am thankful to my husband, Ehab, for offering me invaluable advice on thesis writing and for his patience and support. To my children, Aser and Salma, I am grateful for their love, smiles, laughter, and hugs.
Similar resources
Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have many applications in real-world problems. When we deal with a huge volume of data, analyzing the data is difficult or sometimes impossible. In big data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graphs, but it gives no way to select the number of clusters and the number of members ...
Sampling from social networks’s graph based on topological properties and bee colony algorithm
In recent years, the sampling problem in massive social network graphs has attracted much attention as a way to analyze a small, good sample quickly instead of a huge network. Many algorithms have been proposed for sampling social network graphs. The purpose of these algorithms is to create a sample that is approximately similar to the original network's graph in terms of properties such as de...
An Optimized Firefly Algorithm based on Cellular Learning Automata for Community Detection in Social Networks
Community structure is one of the important features of social networks. A community is a subgraph whose nodes have many connections to nodes inside the community and very few connections to nodes outside it. The objective of community detection is to separate groups or communities that are linked more closely. In fact, community detection is the clustering...
Finding Community Base on Web Graph Clustering
Search Pointers organize the main part of the application on the Internet. However, because of information management hardware, the high volume of data, and word similarities across different fields, most answers to users' questions aren't correct. Web graph clustering and cluster placement in the corresponding answers therefore help users reach their intended results. Community (web communit...
Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems
Ontology is the main infrastructure of the Semantic Web, which provides facilities for integrating, searching, and sharing information on the web. The development of ontologies as the basis of the Semantic Web, together with their heterogeneity, has led to the emergence of ontology matching. With the appearance of large-scale ontologies in real domains, ontology matching systems face problems such as memory con...
Matrix Sequential Hybrid Credit Scorecard Based on Logistic Regression and Clustering
The Basel II Accord pointed out benefits of credit risk management through internal models to estimate Probability of Default (PD). Banks use default predictions to estimate the loan applicants’ PD. However, in practice, PD is not useful and banks applied credit scorecards for their decision making process. Also the competitive pressures in lending industry forced banks to use profit scorecards...
Publication date: 1995